L 16 : Counting Triangles in MapReduce

Author

  • Meysam Taassori
Abstract

The global pool of data is growing by about 2.5 quintillion bytes per day, and more than 90 percent of it was produced in the last two years alone [1]. The era of big data has arrived. After [2] described the Google file system, in which files are split into chunks that are stored redundantly on a cluster of commodity machines, most research groups turned their attention to big data as a new field of research. The next step was MapReduce [3], which tied big data to the MapReduce model. After Google introduced MapReduce in 2004, this style of computation became synonymous with big data. Google uses MapReduce to compute its search indices: the search results sit in its clusters, and every day MapReduce is run to recalculate everything.

Similar resources

MapReduce vs. Pipelining Counting Triangles

In this paper we follow an alternative approach, named pipeline, to build a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting when the input graph either does not fit in memory or is dynamically generated. To be concrete, we implement a dynamic pipeline of processes and an ad-hoc MapReduce version using the language Go....

Colorful Triangle Counting and a MapReduce Implementation

In this note we introduce a new randomized algorithm for counting triangles in graphs. We show that under mild conditions, the estimate of our algorithm is strongly concentrated around the true number of triangles. Specifically, if $p \geq \max\left(\frac{\Delta \log n}{t}, \sqrt{\frac{\log n}{t}}\right)$, where $n$, $t$, $\Delta$ denote the number of vertices in G, the number of triangles in G, the maximum number of triangles an edge of G is cont...

Counting Triangles in Parallel An Application of Pipelining

The usual approach to producing a parallel solution to a computational problem is to find a way to use the Divide & Conquer paradigm in order to have processors acting on their own data so they can all be scheduled in parallel. MapReduce is an example of this approach. We present an alternative program schema that can exploit dynamic pipeline parallelism without having to deal with replication ...

Comparing MapReduce and Pipeline Implementations for Counting Triangles

A common method to define a parallel solution for a computational problem consists in finding a way to use the Divide & Conquer paradigm so that processors act on their own data and are scheduled in parallel. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the...

Approximate Triangle Counting

Triangle counting is an important problem in graph mining. Clustering coefficients of vertices and the transitivity ratio of the graph are two metrics often used in complex network analysis. Furthermore, triangles have been used successfully in several real-world applications. However, exact triangle counting is an expensive computation. In this paper we present the analysis of a practical samp...

Large Scale Graph Mining with MapReduce: Counting Triangles in Large Real Networks

In recent years, a considerable amount of research has focused on the study of graph structures arising from technological, biological and sociological systems. Graphs are the tool of choice in modeling such systems since they are typically described as sets of pairwise interactions. Important examples of such datasets are the Internet, the Web, social networks, and large-scale information netw...

Publication date: 2013